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Abstract 

The rapid advance of sequencing technology, coupled with improvements in molecular methods for obtaining genetic 
data from ancient sources, holds the promise of producing a wealth of genomic data from time-separated individuals. 
However, the population-genetic properties of time-structured samples have not been extensively explored. Here, we 
consider the implications of temporal sampling for analyses of genetic differentiation and use a temporal coalescent 
framework to show that complex historical events such as size reductions, population replacements, and transient 
genetic barriers between populations leave a footprint of genetic differentiation that can be traced through history 
using temporal samples. Our results emphasize explicit consideration of the temporal structure when making inferences 
and indicate that genomic data from ancient individuals will greatly increase our ability to reconstruct population 
history. 
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Introduction 

Recent advances in molecular genetics have opened up the 
possibility of using temporal genetic samples to answer bio- 
logical questions, including studies focusing on viruses 
(Rodrigo and Felsenstein 1999) and studies of animal and 
human remains (Shapiro and Hofreiter 2014). DNA extraction 
from fossils or ancient material was pioneered some 3 decades 
ago (Higuchi et al. 1984; Paabo 1985), but the field of ancient 
DNA has been plagued by problems such as contamination 
from modern-day DNA, postmortem DNA damage, and low 
levels of endogenous DNA. However, many problems have 
been resolved in the last few years. For example, the high 
frequency of postmortem damage in ancient DNA sequences 
(Briggs et al. 2007) can be difficult to distinguish from biolog- 
ical polymorphisms, but experimental solutions have been 
developed, such as using enzymes to repair damaged nucleo- 
tides (Briggs et al. 2010). Likewise, problems arising from con- 
tamination from present-day individuals can be 
circumvented using these same postmortem damage pat- 
terns (Krause et al. 2010; Meyer et al. 2014; Skoglund et al. 
2014), coupled with an assessment of whether the DNA orig- 
inates from a single individual (Green et al. 2010; Krause et al. 
2010). These advances have resulted in a remarkable devel- 
opment, exemplified by the explosion in genomic studies of 
ancient hominid remains such as the sequencing of the 
Neandertal genome (Green et al. 2010; Prufer et al. 2014), 
the Denisova genome (Reich et al. 2010; Meyer et al. 2012), 
and genomic investigations of several prehistoric humans 
(Rasmussen et al. 2010; Keller et al. 2012; Sanchez-Quinto 



et al. 2012; Skoglund et al. 2012; Raghavan et al. 2014). 
There are even isolated examples of DNA preservation in 
fossils that are hundreds of thousands years old (Orlando 
et al. 2013; Meyer et al. 2014). The new sequencing technol- 
ogies have been instrumental for this development simply 
because they work with massive amounts of short-frag- 
mented DNA, which is the state in which we find postmor- 
tem DNA. 

Theoretical aspects of temporal genetic differentiation 
have not been extensively investigated even though many 
of the classical population-genetic parameters, such as 
Wright's F-statistics (Wright 1949), stem from temporal 
models. For example, temporal differences between ancient 
samples, as well as between ancient samples and modern-day 
ones, complicate interpretations of population-genetic struc- 
ture. Even in the absence of population structure, genetic drift 
is expected to produce genetic differences between genetic 
data from different points in time (Krimbas and Tsakas 1971; 
Waples 1989; Nordborg 1998; Anderson et al. 2000; Wang 
2001; Berthier et al. 2002; Beaumont 2003; Depaulis et al. 
2009; Nystrom et al. 2012), which in practice makes separating 
historical scenarios of replacement and genetic drift difficult 
(Nordborg 1998; Serre et al. 2004; Haak et al. 2005; Castroviejo- 
Fisher et al. 2011; Sjodin et al. 2014). However, it may be de- 
sirable to use the temporal structure within a sample to make 
inferences, because time-structured data offer a new dimen- 
sion of information for learning about the demographic his- 
tory. That important information can be extracted from 
temporal samples is illustrated by the long tradition of using 



© The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http:// 
creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, 
provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com 



Open Access 



2516 



Mol. Biol. EvoL 31 (9):251 6-2527 doi:10.1093/molbev/msu192 Advance Access publication June 17, 2014 



Temporal Population-Genetic Structure • doi:10.1093/molbev/msu192 



MBE 



variance in allele frequencies between multi-individual sam- 
ples from discrete time points to infer effective population size 
(Krimbas and Tsakas 1971; Waples 1989; Anderson et al. 2000; 
Wang 2001; Berthier et al. 2002; Beaumont 2003) and methods 
for using single-locus nonrecombining markers, such as mito- 
chondrial DNA, to infer population size changes (Drummond 
et al. 2005; Ramakrishnan et al. 2005; Chan et al. 2006; 
Drummond and Rambaut 2007; Ramakrishnan and Hadly 
2009; Navascues et al. 2010; Ho and Shapiro 2011). 
Furthermore, the coalescent model (Kingman 1982) is readily 
adapted to accommodate time-serial samples (Rodrigo and 
Felsenstein 1999) and several simulation tools that handle 
temporal samples have been developed (Anderson et al. 
2005; Jakobsson 2009; Excofner and Foil 2011). However, the 
use of genomic data from temporal samples for inferring more 
complex population histories remains largely unexplored. As 
the quality and quantity of ancient genomic data is increasing, 
we need a better understanding of how temporal structure 
affects genetic differentiation and diversity. In this article, we 
first illustrate how temporal structure relates to classical 
models of population structure by calculating Wright's fixa- 
tion index, F ST , in simple demographic models, which provides 
an intuitive understanding of the problem at hand. Second, 
we demonstrate that genetic data from temporal samples can 
greatly aid inferences of population history by highlighting 
several instances where wide temporal sampling can provide 
insights that would be hard to obtain otherwise. 

Fundamental Properties of Temporal Genetic 
Structure 

Genetic drift results in differentiation between structured 
populations (Wright 1940, 1951). In a coalescent framework 
(Kingman 1982; Hudson 1990; Slatkin 1991), genetic differen- 
tiation between populations can be viewed as the effect of a 
shorter expected time of coalescence for lineages from the 
same population F[T W ] compared with the expected time of 
coalescence for lineages from different populations E[T B ]. A 
fundamental metric of genetic differentiation in structured 
populations is Wright's fixation index F ST which, in coalescent 
terms, corresponds to 1 — [E[T W ]/((E[T W ] + F[T B ])/2)], where 
£[T W ] and E[T B ] are averaged across populations and com- 
parisons (Slatkin 1991). Taking mutations into account, this 
can be expressed in terms of probabilities of identity by de- 
scent (IBD) such as F ST = (f w - / b )/(1 -f b ). Here,/ b is the prob- 
ability of IBD for lineages picked from different populations 
and/ w is the probability of IBD for lineages picked from the 
same population (averaged over the different populations). 
For instance, if/, and/ 2 are the probabilities of IBD in two 
different groups 1 and 2 



Fst = 



0-5(fi +h) ~fb 
1"/b 



0) 



In this article, we consider F ST for models where samples are 
drawn from two time points and compare this situation to a 
model where the two samples are drawn from different pop- 
ulations that diverged at some point in the past (fig. 1). 



T=0.75 




Fig. 1. Additivity of genetic drift can result in equivalent genetic differ- 
entiation (F ST ) under temporal structure and population divergence. (A 
and B) Thirty individuals are sampled from two populations (15 individ- 
uals from each population) that diverged at a given time in the past. In 
(A), both samples are taken at the present, and in (B), one of the samples 
is taken at 0.5 x 2N e generations before present and the other sample is 
taken at present. (C) Thirty individuals are sampled from two discrete 
time points (15 individuals from each time point) in the history of a 
continuous population. In all three scenarios, the total time that passes 
between each sample is T=2N e generations. The 15 individuals in each 
sample are illustrated as a series of red circles or a series of blue circles. 



If the population size N is constant, the probability of IBD 
in both the temporal model and the divergence model for 
lineages picked from the same population is 



/i =h = 



(2N)" 



1 



(2N)"'+2n 1+1 



(2) 



where ji is the mutation rate per site per generation and 
0 - 4Nu, This is simply the probability that two lineages co- 
alesce before a mutation occurs (2ji is the mutation rate in 
the two lineages [ignoring u 2 terms] and (2N) _1 is the coa- 
lescence rate). As for the probability of IBD between popula- 
tions, in the divergence model (fig. 1A) it is 



A = 0 - - »T 



(2N)" 



exp(-0(7 1 + T 2 )/2) 



(2N)" 1 + 2\i 



1+1 



(3) 



where = t-,/2N and T 2 = t 2 /2N and t-, and t 2 are the times (in 
generations) to the split of the two populations. This expres- 
sion is derived from considering that neither lineage can have 
a mutation before they reach the ancestral population, and 
once in the ancestral population, they must coalesce before a 
mutation occurs (as above). Applying the same argument for 
the temporal model (fig. 1C), two lineages sampled t gener- 
ations apart will be IBD if there is no mutation in the younger 
lineage during t generations and, once in the ancestral pop- 
ulation, the two lineages coalesce before any of them mutate. 
Hence, 



fb = 0 - »y 



(2N)" 1 



(2N)" 1 +2\i 



exp(-9T/2) 
1 +9 ' 



(4) 



where T = t/2N. If T= T q + T 2 then F ST in the temporal and 
divergence model is the same and 



Fst = 1 



+ 1 - exp(-9T/2) 



(5) 
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Fig. 2. Dependence of F ST in temporal and divergence models conditioning on the allele being polymorphic. (A) F ST as a function of T, the total time 
that separates two populations (two times the population divergence time) or the time that separates two samples in model of samples taken at two 
different time points. The gray line shows the function F ST = 1 — e~ T/2 (Nei 1973). (B) The models used for simulating population-genetic data and 
computing F ST . The split model illustrates a population split T/2 time units in the past and the temporal model that illustrates a single population (of 
constant size) from which samples have been taken at two time points. Arrows point to where, in time, sites have been ascertained for variation (see 
main text for a full description of the procedure). 



Note that this extends naturally also for models with 
both divergence and temporal samples, such as the model 
in figure 1B. However, this simple relationship between 
temporal structure and divergence models only holds 
when the population size is constant. When the population 
size is not constant, F ST in the temporal and divergence 
models is expected to be equal only under very specific con- 
ditions (see supplementary fig. S1, Supplementary Material 
online). 

Nei's Estimator of Divergence Time between 
Populations 

Based on a result from Nei (1973), it is commonly stated that 
expected F ST in a divergence model with constant size equals 
1— e -772 (letting T denote the total time in coalescent units 
that separates two populations as above). This result was 
derived under a very specific model assumption — namely 
that all polymorphisms were present in the ancestral popu- 
lation. Furthermore, this only applies when sampling times 
are equal, because for a temporal model where polymor- 
phisms were present in the ancestral population, we find 
instead that F sx = (1— e~ T )/2 (supplementary material, 
Supplementary Material online). Curiously, simulations high- 
light the generality with which F ST responds to genetic drift 
under constant-size scenarios, because Nei's case with ascer- 
tainment of polymorphic loci in the ancestral population of 
the divergence model (F ST = 1 — e~ Tn ) corresponds to the 
temporal case if ascertainment of polymorphisms is per- 
formed at the midpoint between the two temporal 
samples (fig. 2). Likewise, the expectation when polymor- 
phisms are ascertained in the ancestral population of the 
temporal model (F ST = (1 — e _7 )/2) corresponds to ascertain- 
ing in one of the two populations of the divergence model 
(fig. 2). 



F ST and the Combined Effect of Migration and 
Temporal Structure 

We study the effect of migration by considering a simple 
island/stepping-stone model with two populations/demes 
of equal size N and a symmetric migration rate, m, between 
them and with the two populations being sampled t gener- 
ations apart. In this case, F ST can be shown to be 

29 

ST_ ~ [ 20 + 1 - exp(-9T/2) ~~ ~ 

+ ^T7U [1 + exp( " (e + 4 ^) T /2)] 
I 6 + 4A/1 J 

(6) 

where 6 - 4Nji, A/1 = 2Nm, and T=t/2N (see supplementary 
material, Supplementary Material online). As A/1 increases, this 
expression converges to the formula for F ST in a pure tempo- 
ral model with a population of constant but twice as large 
effective population size (so that the scaled mutation rate is 
larger by a factor 2, whereas the scaled time is half as large, 
compare to eq. 5 above). Intuitively, increasing the migration 
rate lowers F ST> whereas an increase in time between the 
sampled time points increases F ST (fig. 3). Importantly, for a 
fixed value of F ST (and 0), there is no definite solution in terms 
of A/1 because this will depend on T, so that F ST is not a direct 
measure of migration rate under this model unless T is known 
(which requires that N and t are known). This is similar to the 
difficulty associated with differentiating between population 
split time and migration in spatial divergence models (Nielsen 
and Wakeley 2001). 

Results 

The simple theoretical models considered above indicate that 
both temporal structure and spatial structure affect F ST in a 
rather similar manner but that their effects are sufficiently 
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Fig. 3. Dependence of F ST in a simple temporal island model. The 
model consists of two equally sized populations with symmetric migra- 
tion rate. The X axis shows the separation in (scaled) time between the 
samples. 6 (the scaled mutation rate) is set to 0.1. The continuous lines 
show how F ST for models with different (scaled) migration rates depends 
on time separation. The dotted lines show F ST if the samples are not 
separated in time or, equivalently, if the effective population size is 
infinitely large. The red line is the limit when migration is infinitely 
fast. In this case, the model is identical to a purely temporal model 
with a doubled (and constant) population size (the larger population 
size implies that 6 is twice as large). The dark blue line is the limit when 
there is no migration while the light blue line at log(M)=0 is included for 
reference. 



different to prompt caution in interpretations of F ST , in par- 
ticular for cases where temporal samples are involved. To 
investigate more complex scenarios of continuous sampling 
over time, we now turn to a simulation-based approach. 

Stepping-Stone Migration Model with Temporal 
Samples 

In contrast to isolation models, stepping-stone migration 
models — where populations (demes) are connected in one- 
or two-dimensional landscapes — typically result in continu- 
ous differentiation between individuals rather than discrete 
genetic clusters of individuals (Novembre and Stephens 2008; 
Engelhardt and Stephens 2010). Given that temporal genetic 
structure also affects expected coalescence times between 
lineages, it can be expected that temporal differentiation 
would display similar behavior. To demonstrate this phenom- 
enon, we designed a temporal simulation algorithm (see 
Materials and Methods) based on Hudson's "ms" coalescence 
simulation software (Hudson 2002) and simulated a model 
with 100 demes in a two-dimensional habitat (10x10 lattice) 
with stepping-stone migration. We used 4N e m = 2, where m is 
the fraction of each subpopulation made up of new migrants 
each generation (note that scaling in ms is slightly different to 
the theory above) and sampled one haploid individual from 
each deme at ten time points separated by t = 4N e genera- 
tions, creating a three-dimensional model comprising the two 



spatial dimensions and the temporal dimension (fig. 4A). 
Because of the increased complexity of the data, pairwise 
comparisons such as F ST are poorly suited to analyze the 
results. Instead, we used principal component analysis 
(PCA) to summarize and visualize the resulting population- 
genetic data (see Materials and Methods). PCA and F sx have 
strong conceptual connections, with principal components 
(PCs) being closely related to the average coalescent times 
between pairs of haploid genomes (McVean 2009). We find 
that the first three PCs mirror the three dimensions of the 
model (three-dimensional Procrustes correlation: 0.984, 
P < 10 -5 ) (fig. 4B). Specifically, PC1 and PC2 represented iso- 
lation-by-distance in the two-dimensional habitat, whereas 
PC3 represented temporal differentiation (supplementary 
fig. S2, Supplementary Material online), but this order of 
PCs will depend on the relative magnitudes of the scaled 
migration rate and genetic drift between time points 
(McVean 2009). 

Temporal Genetic Differentiation Can Be Informative 
about Complex Population Histories 
As illustrated in figure 1C, genetic differentiation can also 
occur in the absence of any spatial structure, that is, in sam- 
ples taken at different time points from a single continuous 
(unstructured) population. To investigate temporal differen- 
tiation more closely, we simulated a single continuous pop- 
ulation with an effective population size of 5,000 diploid 
individuals and a generation time of 25 years, sampling 20 
diploid individuals from the present, and an additional 20 
diploid individuals evenly distributed over the period 500- 
10,000 years ago with a 500-year interval between each sam- 
pled individual (fig. 5A). In a PCA, we see that PC1 captures 
the temporal genetic differentiation, separating the samples 
from the most recent to the most ancient as a monotonic 
(but not linear) cline, where individuals close in time are also 
more genetically similar (fig. 5D and supplementary fig. S3, 
Supplementary Material online). To investigate the effect of 
population size fluctuations (i.e., fluctuations in the magni- 
tude of genetic drift), we reduced the population to a tenth of 
its original size between 5,000 and 5,500 years before present. 
Under this sampling scheme, the bottleneck is easily detected 
as a discontinuation in the monotonic cline (fig. 5B and E and 
supplementary fig. S3, Supplementary Material online). 

We also simulated population-genetic data under a diver- 
gence model of two populations that diverged 10,000 years 
ago (fig. 5C). Ten ancient individuals were sampled at different 
time points between 10,000 and 5,500 years from one popu- 
lation, and 30 individuals were sampled from the other pop- 
ulation, between 5,000 years ago and the present (ten ancient 
individuals spread out in time and 20 present-day individuals). 
This simulation could correspond to a scenario where the 
older population was replaced with new colonizers from an- 
other population. In the simulated data, individuals sampled 
before the replacement event show a trajectory along PC1 
through time that is angled away from the individuals in the 
population that replaced the previous population (fig. 5F and 
supplementary fig. S3, Supplementary Material online). In 
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Fig. 4. Genetic differentiation mirrors the sampling scheme in a model with both temporal and spatial structure. (A) Two-dimensional stepping-stone 
migration model from which ten haploid individuals were sampled at different historical time points. (B) PCA of data generated under the model. Large 
symbols correspond to high PC1 and PC3 values but low PC2 values. 
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Fig. 5. Temporal sampling distinguishes genetic drift from population structure. (A) Constant population size model. (B) Bottleneck model. (C) 
Replacement model. (D) PC1 stratified by sample time under the constant population size model. (£) PC1 stratified by sample time under the 
bottleneck model. (F) PC1 stratified by sample time under the replacement model. Each colored circle corresponds to a single-sampled individual 
except for the large circles at time zero which corresponds to 20 sampled individuals in A, B, and C (in D, E, and F, the 20 individuals sampled at time zero 
end up on top of each other). F ST between samples from before and after the bottleneck/replacement events at 5,500 years ago fails to distinguish 
between the models (F ST = 0.01 54 ± 0.0003 and 0.01 53 ± 0.0003, respectively, see fig. 6). 



contrast, in the bottleneck scenario, the sampled individuals 
from before the bottleneck event show a trajectory along PC1 
as a function of time that is angled toward the individuals in 
the descendant population (fig. 5E and supplementary fig. S3, 
Supplementary Material online). However, F ST between the 
ancient individuals from before and after the event was in- 
distinguishable under the bottleneck model and the replace- 
ment model (0.01 54 ±0.0003 and 0.01 53 ± 0.0003, 
respectively; fig. 6). To complement the PCA approach, we 



reconstructed maximum-likelihood trees (supplementary fig. 
S4, Supplementary Material online) using the covariance in 
allele frequencies between individuals (Pickrell and Pritchard 
2012) and other pairwise F ST comparisons (fig. 6). This analysis 
gave different results depending on the way the samples were 
obtained. The two scenarios were (again) indistinguishable if 
the samples were grouped into three separate temporal sam- 
ples. In contrast, if the full temporal structure was accounted 
for so that each sample was treated independently, the 
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Fig. 6. Genetic differentiation between temporal sample groups. (A) F ST computed on aggregated sample groups is unable to differentiate the 
bottleneck and replacement models. "Moderns": 20 samples from time 0. "Young": 10 samples from time 0-5,000 years ago. "Old": 10 samples 
from 5,500 to 10,000 years ago. (B) F ST between individuals adjacent in time is able to detect a sudden increase in F ST between the pair of individuals that 
flank the demographic event (both bottleneck and replacement), but we are unable to separate the replacement and bottleneck scenarios. Standard 
errors are not shown but ranged between 0.002 and 0.003. (C) F ST between 20 modern individuals and each ancient individual. Standard errors are not 
shown but ranged between 0.0010 and 0.0014. 



maximum-likelihood trees revealed a difference between the 
bottleneck model and the replacement model. These obser- 
vations illustrates that many inference tools can lead to in- 
correct conclusions for temporally sampled data, and they 
emphasize the importance of considering detailed temporal 
sampling structure for distinguishing between bottleneck and 
replacement models. It also illustrates that the considerable 
power to distinguish different models that we report is not 
directly linked to the use of PCA methods but is mainly due to 
the temporal sampling schemes. 

Transient Genetic Barriers 

To study more complex population models, we simulated a 
population split model which involved two populations (A 
and B) that diverged 8,000 years ago. We kept the same sim- 
ulation parameters and temporal sampling scheme as above 
but assigned the 4 ancient individuals from 3,500, 4,500, 5,500, 
and 6,500 years ago to population B and the remaining 16 
ancient individuals to population A (fig. 7A). Strikingly, the 
population split event is readily identifiable when PC1 is strat- 
ified by sampling time (fig. 7C and supplementary fig. S3, 
Supplementary Material online). In a further modification 
of the model, we simulated secondary admixture between 
the two populations 3,000 years before present and where 
75% of the genetic material of the recent population was 
contributed by population A and 25% was contributed by 
population B (fig. IB). A plot of PC1 versus sampling time 
shows the two series of individuals represented by samples 
from the two populations becoming more similar as time 
approaches the time of admixture (fig. ID and supplementary 
fig. S3, Supplementary Material online), suggesting that tran- 
sient genetic barriers can be investigated using continuous 
temporal genetic data. 

Approximate Bayesian Computation Using Temporal 
Genetic Differentiation 

The observation that some statistics of temporal genetic dif- 
ferentiation can recapitulate population history suggests that 



those statistics can be used to infer population history in 
more formal settings. We used approximate Bayesian com- 
putation (Tavare et al. 1997; Pritchard et al. 1999; Beaumont 
et al. 2002) to exemplify that temporal genetic data can be 
used to infer parameters of a demographic model based solely 
on PC1 loadings of sampled individuals as summary statistics. 
We applied this approach to a data set consisting of 44 
Siberian Woolly Mammoth samples spanning 50,000 years 
and genotyped at four microsatellites (Nystrom et al. 2012). 
The original study used conventional summary statistics and 
aggregated temporal groups to show that a population size 
reduction during the Holocene transition could explain the 
fact that two temporal groups were genetically differentiated. 
Here, we expand the inference to a three-parameter model 
(fig. 8): The time of change, the effective population size 
before the change, and the effective population size after 
the change, allowing for either a reduction or expansion in 
population size (see Materials and Methods). The estimated 
posterior distribution indicates a size reduction at around 
11,200 years ago with the effective population size in the 
more recent time period being approximately ten times smal- 
ler than before the change (table 1 and fig. 8). The inferred 
timing of this population size reduction coincides with the 
isolation of Wrangel Island from the Siberian mainland 
(Vartanyan et al. 1993) and thus corroborates the hypothesis 
that this restriction of the habitat triggered a founder event in 
the resident mammoth population (Nystrom et al. 2010, 
2012). 

Pitfalls When Comparing Ancient Genomes to 
Modern Populations 

A common situation is that a single ancient genome is avail- 
able from a certain time point, and the goal is to investigate 
the historical relationship between the ancient individual and 
present-day populations. To investigate the differentiation 
between a single ancient genome and more recent popula- 
tions, we simulated ten individuals from each of two popu- 
lations (A and B) which diverged 20,000 years ago 



2521 



Skoglund et al. • doi:10.1093/molbev/msu192 



MBE 









o o o o 





B 



0 0 0 



O O 0 0 




~i 1 1 1 r 

0 2000 4000 6000 8000 
Age (years b.p.) 



CM 



O 
O 




1 1 1 1 1 r 

0 2000 6000 10000 

Age (years b.p.) 



Fig. 7. Temporal sampling can be used to detect transient genetic barriers. (A) Split model. (B) Split-admixture model. (C) PC1 stratified by time for 
data simulated under the split model. (D) PC1 stratified by time for data simulated under the split-admixture model. Each colored circle corresponds to 
a single sampled individual except for the large circles at time zero, which corresponds to 20 sampled individuals in A and B (in C and D, the 20 
individuals sampled at time zero overlap). The 4 ancient individuals from 3,500, 4,500, 5,500, and 6,500 years ago (marked circles) were sampled from 
population B (bottom population in the model illustrations) and the remaining ancient 16 individuals were sampled from population A (top population 
in the model illustrations). 
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Fig. 8. Approximate Bayesian Computation of Woolly Mammoth demographic history using PC1 loadings as summary statistics. (A) Plot of PC1 versus 
the age of the mammoth individuals. (B) Illustration of the three-parameter model of instantaneous size change. (C) Estimated posterior distribution for 
the time of size change. (D) Estimated posterior distributions of effective population size before and after the size change. Prior distributions in (C) and 
(D) are shown by the gray lines. 
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Table 1. ABC Inference of Northeast Siberian Woolly Mammoth 
Demographic History Using PC Loadings as Summary Statistic. 



Parameter Prior (Uniform) Posterior Posterior 

Mode 95% CI 



N e before change 


200- 


-50,000 


23,500 


16,900-29,400 


N e after change 


200- 


-50,000 


1,800 


1,000-8,300 


Time of change 


3,000- 


-40,000 


11,200 


5,100-23,100 


(years ago) 











Note. — ABC approximate Bayesian computation. N e is the effective population size. 



(N e = 10,000) and a single 18,000-year-old individual from the 
lineage leading to population B (fig. 9). Using PCA, we found 
that PC1 captures the spatial differentiation between popu- 
lations A and B, whereas PC2 captures the temporal differen- 
tiation between the ancient sample and the modern sample 
(fig. 9C). The ancient sample appears closer to population B, 
recapitulating the population history. However, when we 
modified the model to include a 10-fold population size re- 
duction in population B after the time of sampling of the 



B 



past 
A 



Ten -fold 
reduction of 
population size 



present 



pop A 



pop B 



pop A 



pop B 





0.0056 C = -0.0049 
0.0049 D = -0.0056 





C = -0.079 C = -0.055 
D= 0.055 D = -0.079 



C= 0.0049 
D = -0.0007 



C= 0.055 
D = -0.024 



Fig. 9. Comparing a single ancient genome to modern populations. (A) Population divergence model with constant effective population size. (B) 
Population divergence with a 10-fold population size reduction postdating the ancient individual. (C) PCA of 100,000 SNPs simulated under the model 
in (A). (D) PCA of 100,000 SNPs simulated under the model in (B). (£) Population topology inferred using C tests and D tests based on 100,000 
independent SNPs simulated under the model in (A), and (F) population topology inferred using C tests and D tests based on 100,000 independent 
SNPs simulated under the model in (B). Values for the C-statistic are only positive for the correct topology, and absolute values of the D-statistic are 
lowest for the correct topology. The tree topologies displayed in (£) and (F) represent the three possible topologies tested and the larger trees represent 
the true topology (and also the one supported by the statistics). The gray circles in (£) and (F) represent an outgroup individual constructed from the 
ancestral alleles of each simulated locus. For details on these tests, see Materials and Methods. 
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ancient genome (15,000 years ago; fig. 9B), the ancient sample 
instead clustered closer to population A (fig. 9D), despite the 
fact that the ancient individual was sampled from the popu- 
lation that is ancestral to the extant sample from population 
B. This pattern is due to the fact that less time (on the coa- 
lesced scale) has passed between the ancient sample and the 
extant sample from population A, and the genetic differenti- 
ation between the ancient individual and the extant sample 
from population A (F ST = 0.030 =b 0.001 ) was also smaller than 
the genetic differentiation between the ancient sample and 
the extant sample from population B (F ST = 0.055 ± 0.001 ). 
Thus, if the demographic history was unknown, one could 
possibly mistakenly conclude that the ancient sample shares a 
more recent genetic history with population A, solely due to 
the different magnitudes of genetic drift. Indeed, the param- 
eter of historical interest is often the degree of shared history, 
that is, the amount of shared genetic drift and not the relative 
degrees of differentiation. Accordingly, we were able to iden- 
tify the correct topology (fig. 9E and F) using concordance 
tests (Schlebusch et al. 2012; Skoglund et al. 2012) and D 
statistics (Reich et al. 2009; Durand et al. 2011; Patterson 
et al. 2012) that are less sensitive to lineage-specific genetic 
drift. 

Discussion 

The main insight that arises from our analyses is that wide 
temporal sampling provides information that can be hard to 
attain using modern-day data alone or more clustered tem- 
poral groups. The importance of wide temporal sampling 
could also explain previous results suggesting that not 
much statistical power is gained solely by adding one or a 
few temporal sample groups (Mourier et al. 2012). Spatial 
sampling structure can also have a substantial impact on 
inferences of population history using modern-day data 
(Serre and Paabo 2004; Rosenberg et al. 2005; Chikhi et al. 
2010; DeGiorgio and Rosenberg 2013) in which case differen- 
tiating between the relative contributions of migration and 
genetic drift because population divergence is a serious chal- 
lenge (Nielsen and Wakeley 2001). In contrast to the many 
similarities between spatial and temporal structure that we 
have highlighted, the possibility of migration in the different 
dimensions represents a fundamental difference, because mi- 
gration of lineages is not possible in the temporal dimension 
(except in the case of overlapping generations or seed bank 
models, see Kaj et al. [2001]), resulting in a more constrained 
set of models that may be consistent with a particular pattern 
of genetic variation. 

One of the enduring challenges in population-genetic anal- 
ysis of ancient DNA is whether some observed level of genetic 
differentiation between temporal sample groups is the result 
of genetic drift (possibly enhanced by a bottleneck) or the 
result of a replacement of the older population with new 
colonizers from another population (Nordborg 1998). We 
show that this question can be addressed by considering 
the trajectory of genetic relatedness within a temporal 
sample that spans the time of the putative event. 
Additional hypotheses about population history that are dif- 
ficult to address with genetic data from one or a few time 



points but that can be addressed with wide temporal samples 
include the timing of bottlenecks and transient genetic bar- 
riers. Conventional inference of the timing of population size 
reductions usually requires assumptions about mutation rate 
and/or recombination rate (Ramakrishnan et al. 2005; Voight 
et al. 2005; Li and Durbin 2011; Mourier et al. 2012; Sheehan 
et al. 2013). As illustrated, for example, in figure 5£, the use of 
continuously distributed temporal data allows accurate iden- 
tification of the time of population size reduction that is 
robust to assumptions about mutation and recombination 
rates. For these reasons, ancient genomic data promise to 
advance our understanding of the recent evolutionary history 
of many species. 

Materials and Methods 

To investigate temporal structure under an infinitely many- 
sites mutation model and population structure (see also 
Excoffier and Foil 2011; Skoglund et al. 2011), we developed 
a temporal coalescent simulation algorithm based on 
Hudson's (2002) ms. The idea here is to use the versatility 
of ms to simulate a genealogy but use in-house custom code 
for the mutation process to accommodate different branch 
lengths due to temporal structure. The algorithm proceeded 
as follows: For a sample of size L ancient diploid individuals, 
we instruct the program ms to create 2L isolated subpopu- 
lations and sample a single lineage from each. At the desired 
time t h of each historical sample, each of the 2L subpopula- 
tion is joined (command "-ej") with the appropriate popula- 
tion to which they belong. From the gene tree output of ms 
(command "-T"), we subtract t h from the external branch of 
each ancient sample and add a single mutation on the result- 
ing genealogy with probability equal to branch length 
(Hudson 1990) using custom code. For example, if there is 
one individual to be sampled at time 0.3 and five additional 
individuals at time 0.4, two lineages are joined to the popu- 
lation at time 0.3 and the remaining ten lineages join the 
population at time 0.4. To increase precision, we modified 
the source code of ms to produce 12 decimal digits for each 
branch in the gene tree output. The custom code is available 
upon request. We validated the algorithm by comparison 
with COMPASS (Jakobsson 2009), which allows temporal 
samples but not from multiple populations. Under the 
model in figure 1A, we obtained identical estimates of 
F ST = 0.337 =b 0.001 for both algorithms, as well as highly sim- 
ilar site frequency spectra (supplementary fig. S5, 
Supplementary Material online). For all simulations above, 
except the two-dimensional spatial lattice and the Woolly 
Mammoth analysis (see below), we simulated 2 x 100,000 
independent (unlinked) SNPs for each individual and com- 
bined pairs of lineages to create a diploid genotype for each 
individual. When time is given in years, we assumed a 25-year 
generation time, except in the case of the Woolly Mammoth, 
where we assumed 15 years as in Nystrom et al. (2012). 

F ST was estimated using equation (5.3) in Weir (1996) with 
standard errors estimated using a block jackknife, dropping 
blocks of 1,000 loci in turn. PCA was performed using the 
prcomp function in R 2.11.1 (R Development Core Team 
2010). Except in the case of microsatellites and the three- 
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dimensional stepping-stone model with temporal samples, 
we used the normalization suggested by Patterson et al. 
(2006). For the 44 mammoth individuals in Nystrom et al. 
(2012) that had no missing data for the four microsatellites, 
we considered each unique microsatellite allele to be a sep- 
arate marker, which were given a count of 0, 1, or 2 copies in 
each individual. Maximum-likelihood trees were inferred 
using TreeMix version 1.11 (Pickrell and Pritchard 2012) as- 
suming no migration and using a block size of 1,000 SNPs for 
estimating standard errors. 

To confirm the relationship between temporal structure 
and divergence models, we estimated F ST between two sam- 
ples of 15 diploid individuals each for three simulated demo- 
graphic models with a constant effective population size (N e ) 
(fig. 1). In the first model (A), both samples were from the 
same time point but from two populations that had diverged 
7=0.5 time units into the past (fig. 1A). The second model 
(B) assumed that one sample was 7=0.5 time units older 
than the other and that the two samples were from different 
populations that diverged 7=0.75 time units into the past 
(fig. 1B). The third model assumed a single continuous pop- 
ulation but with one sample 7=1.0 coalescent time units 
(2N e generations) older than the other (fig. 1C). Most impor- 
tantly, in all three models, the total coalescent time that 
passes as one follows the history from one sample to the 
other is 7= 1.0. In all three models, we also estimate F ST to 
approximately 0.33 (0.337 ± 0.001 [±1 standard error], 
0.334 ± 0.001, and 0.335 ± 0.001, respectively). 

To simulate microsatellite data, we implemented a step- 
wise mutation model with /x = 10 -3 for COMPASS 
(Jakobsson 2009) as in Nystrom et al. (2012), where each 
mutation event either (with equal probability) adds or sub- 
tracts one unit from an arbitrarily chosen starting length 
(100). After this simulation, we considered each simulated 
(unique) microsatellite allele as its own marker, which was 
counted as above, and used that information as input for the 
PCA. We used the PC1 loading of each individual as summary 
statistic (in total a vector of 44 summary statistics). We sim- 
ulated 100,000 replicates from which 0.2% of the replicates 
with the smallest Euclidian distance to the empirical PC1 
loadings were used to obtain posterior distributions using 
local linear regression (Beaumont et al. 2002) after log trans- 
formation as implemented in the abc R package (Csillery et al. 
2012). 

To investigate the population topology inferred from 
single individuals, we applied tests that utilize sharing of de- 
rived alleles. D-statistics were computed using a strategy of 
sampling a single haploid gene copy from each population 
(Reich et al. 2009; Durand et al. 2011; Patterson et al. 2012). 
We tested all three possible topologies that could be con- 
structed using four taxa: Population A, population B, the an- 
cient individual, and an outgroup individual (gray symbol in 
fig. 9E and F) that was taken to carry the ancestral allele 
(which is given in the ms simulations). Specifically, the topol- 
ogies tested were (Outgroup, (Ancient, (population A, pop- 
ulation B))); (Outgroup, (population A, (Ancient, population 
B))); and (Outgroup, (population B, (Ancient, population 
A))). For a proposed topology of the form (Outgroup, (J, (V, 



Z))), we denote the count of all observations of a shared 
derived allele ("B") for J and V that is absent from 
Outgroup and Z by "ABBA" (here "B" symbolizes the derived 
state and "A" the ancestral state), and the count of all obser- 
vations of a shared derived allele for J and Z that is absent 
from Outgroup and V by "BABA." The D-statistic is given by 

BABA - ABBA 

D = , (7) 

BABA + ABBA v J 

and a deviation from zero suggest a violation of the proposed 
topology. We computed concordance statistics (Schlebusch 
et al. 2012; Skoglund et al. 2012) using the same data and 
testing the same topologies, but these tests also use the con- 
figuration where Z and V share a derived allele that is absent 
from Outgroup and J, which we denote "AABB." The concor- 
dance statistic is given by 

AABB - max(ABBA, BABA) 

c = 7 1 r , (8) 

AABB + max(ABBA, BABA) v 7 

and positive values of C indicate concordance with the pro- 
posed topology. 

Supplementary Material 

Supplementary material is available at Molecular Biology and 
Evolution online (http://www.mbe.oxfordjournals.org/). 
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